[TTI] Check type legalization of both src and result for fpto{u|s}i.sat. #147657

Merged
ElvisWang123 merged 5 commits into llvm:main from fix-cast-sat-invalid-basic on Jul 10, 2025

Conversation

ElvisWang123
Contributor

For cast intrinsics such as fptoui.sat and fptosi.sat, we need to check
that both the source type and the result type can be legally lowered. If
either of them is invalid, return an invalid cost.

--
Fixes #142973.

@llvmbot llvmbot added the llvm:analysis Includes value tracking, cost tables and constant folding label Jul 9, 2025
@llvmbot
Member

llvmbot commented Jul 9, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Elvis Wang (ElvisWang123)

Changes

For cast intrinsics such as fptoui.sat and fptosi.sat, we need to check
that both the source type and the result type can be legally lowered. If
either of them is invalid, return an invalid cost.

--
Fixes #142973.


Patch is 77.72 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/147657.diff

2 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/BasicTTIImpl.h (+8-4)
  • (added) llvm/test/Analysis/CostModel/RISCV/cast-sat.ll (+583)
diff --git a/llvm/include/llvm/CodeGen/BasicTTIImpl.h b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
index 2b9be43eadb7a..079a405be3eba 100644
--- a/llvm/include/llvm/CodeGen/BasicTTIImpl.h
+++ b/llvm/include/llvm/CodeGen/BasicTTIImpl.h
@@ -2486,11 +2486,15 @@ class BasicTTIImplBase : public TargetTransformInfoImplCRTPBase<T> {
       ISD = ISD::UMULO;
       break;
     case Intrinsic::fptosi_sat:
-      ISD = ISD::FP_TO_SINT_SAT;
-      break;
-    case Intrinsic::fptoui_sat:
-      ISD = ISD::FP_TO_UINT_SAT;
+    case Intrinsic::fptoui_sat: {
+      std::pair<InstructionCost, MVT> SrcLT = getTypeLegalizationCost(Tys[0]);
+      std::pair<InstructionCost, MVT> RetLT = getTypeLegalizationCost(RetTy);
+      if (!SrcLT.first.isValid() || !RetLT.first.isValid())
+        return InstructionCost::getInvalid();
+      ISD = IID == Intrinsic::fptosi_sat ? ISD::FP_TO_SINT_SAT
+                                         : ISD::FP_TO_UINT_SAT;
       break;
+    }
     case Intrinsic::ctpop:
       ISD = ISD::CTPOP;
       // In case of legalization use TCC_Expensive. This is cheaper than a
diff --git a/llvm/test/Analysis/CostModel/RISCV/cast-sat.ll b/llvm/test/Analysis/CostModel/RISCV/cast-sat.ll
new file mode 100644
index 0000000000000..2306d51c89118
--- /dev/null
+++ b/llvm/test/Analysis/CostModel/RISCV/cast-sat.ll
@@ -0,0 +1,583 @@
+; NOTE: Assertions have been autogenerated by utils/update_analyze_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -mtriple=riscv64 -mattr=+zve32f,+zvl128b,+f,+d,+zfh,+zvfh -passes="print<cost-model>" -cost-kind=throughput 2>&1 -disable-output | FileCheck %s --check-prefixes=CHECK,RV64ZVE32F
+; RUN: opt < %s -mtriple=riscv64 -mattr=+v,+zvl128b,+f,+d,+zfh,+zvfh -passes="print<cost-model>" -cost-kind=throughput 2>&1 -disable-output | FileCheck %s --check-prefixes=CHECK,RV64V
+
+define void @fptoui_sat() {
+; RV64ZVE32F-LABEL: 'fptoui_sat'
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f32_v1i8 = call <1 x i8> @llvm.fptoui.sat.v1i8.v1f32(<1 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f64_v1i8 = call <1 x i8> @llvm.fptoui.sat.v1i8.v1f64(<1 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f32_v1i16 = call <1 x i16> @llvm.fptoui.sat.v1i16.v1f32(<1 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f64_v1i16 = call <1 x i16> @llvm.fptoui.sat.v1i16.v1f64(<1 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f32_v1i32 = call <1 x i32> @llvm.fptoui.sat.v1i32.v1f32(<1 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f64_v1i32 = call <1 x i32> @llvm.fptoui.sat.v1i32.v1f64(<1 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f32_v1i64 = call <1 x i64> @llvm.fptoui.sat.v1i64.v1f32(<1 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f64_v1i64 = call <1 x i64> @llvm.fptoui.sat.v1i64.v1f64(<1 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f32_v1i1 = call <1 x i1> @llvm.fptoui.sat.v1i1.v1f32(<1 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f64_v1i1 = call <1 x i1> @llvm.fptoui.sat.v1i1.v1f64(<1 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f32_v2i8 = call <2 x i8> @llvm.fptoui.sat.v2i8.v2f32(<2 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f64_v2i8 = call <2 x i8> @llvm.fptoui.sat.v2i8.v2f64(<2 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f32_v2i16 = call <2 x i16> @llvm.fptoui.sat.v2i16.v2f32(<2 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f64_v2i16 = call <2 x i16> @llvm.fptoui.sat.v2i16.v2f64(<2 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f32_v2i32 = call <2 x i32> @llvm.fptoui.sat.v2i32.v2f32(<2 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f64_v2i32 = call <2 x i32> @llvm.fptoui.sat.v2i32.v2f64(<2 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v2f32_v2i64 = call <2 x i64> @llvm.fptoui.sat.v2i64.v2f32(<2 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 4 for instruction: %v2f64_v2i64 = call <2 x i64> @llvm.fptoui.sat.v2i64.v2f64(<2 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f32_v2i1 = call <2 x i1> @llvm.fptoui.sat.v2i1.v2f32(<2 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f64_v2i1 = call <2 x i1> @llvm.fptoui.sat.v2i1.v2f64(<2 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f32_v4i8 = call <4 x i8> @llvm.fptoui.sat.v4i8.v4f32(<4 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f64_v4i8 = call <4 x i8> @llvm.fptoui.sat.v4i8.v4f64(<4 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f32_v4i16 = call <4 x i16> @llvm.fptoui.sat.v4i16.v4f32(<4 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f64_v4i16 = call <4 x i16> @llvm.fptoui.sat.v4i16.v4f64(<4 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f32_v4i32 = call <4 x i32> @llvm.fptoui.sat.v4i32.v4f32(<4 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f64_v4i32 = call <4 x i32> @llvm.fptoui.sat.v4i32.v4f64(<4 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %v4f32_v4i64 = call <4 x i64> @llvm.fptoui.sat.v4i64.v4f32(<4 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 8 for instruction: %v4f64_v4i64 = call <4 x i64> @llvm.fptoui.sat.v4i64.v4f64(<4 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f32_v4i1 = call <4 x i1> @llvm.fptoui.sat.v4i1.v4f32(<4 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f64_v4i1 = call <4 x i1> @llvm.fptoui.sat.v4i1.v4f64(<4 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f32_v8i8 = call <8 x i8> @llvm.fptoui.sat.v8i8.v8f32(<8 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f64_v8i8 = call <8 x i8> @llvm.fptoui.sat.v8i8.v8f64(<8 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f32_v8i16 = call <8 x i16> @llvm.fptoui.sat.v8i16.v8f32(<8 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f64_v8i16 = call <8 x i16> @llvm.fptoui.sat.v8i16.v8f64(<8 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f32_v8i32 = call <8 x i32> @llvm.fptoui.sat.v8i32.v8f32(<8 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f64_v8i32 = call <8 x i32> @llvm.fptoui.sat.v8i32.v8f64(<8 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %v8f32_v8i64 = call <8 x i64> @llvm.fptoui.sat.v8i64.v8f32(<8 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 16 for instruction: %v8f64_v8i64 = call <8 x i64> @llvm.fptoui.sat.v8i64.v8f64(<8 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f32_v8i1 = call <8 x i1> @llvm.fptoui.sat.v8i1.v8f32(<8 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v8f64_v8i1 = call <8 x i1> @llvm.fptoui.sat.v8i1.v8f64(<8 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv1f32_nxv1i8 = call <vscale x 1 x i8> @llvm.fptoui.sat.nxv1i8.nxv1f32(<vscale x 1 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv1f64_nxv1i8 = call <vscale x 1 x i8> @llvm.fptoui.sat.nxv1i8.nxv1f64(<vscale x 1 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv1f32_nxv1i16 = call <vscale x 1 x i16> @llvm.fptoui.sat.nxv1i16.nxv1f32(<vscale x 1 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv1f64_nxv1i16 = call <vscale x 1 x i16> @llvm.fptoui.sat.nxv1i16.nxv1f64(<vscale x 1 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv1f32_nxv1i32 = call <vscale x 1 x i32> @llvm.fptoui.sat.nxv1i32.nxv1f32(<vscale x 1 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv1f64_nxv1i32 = call <vscale x 1 x i32> @llvm.fptoui.sat.nxv1i32.nxv1f64(<vscale x 1 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv1f32_nxv1i64 = call <vscale x 1 x i64> @llvm.fptoui.sat.nxv1i64.nxv1f32(<vscale x 1 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv1f64_nxv1i64 = call <vscale x 1 x i64> @llvm.fptoui.sat.nxv1i64.nxv1f64(<vscale x 1 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv1f32_nxv1i1 = call <vscale x 1 x i1> @llvm.fptoui.sat.nxv1i1.nxv1f32(<vscale x 1 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv1f64_nxv1i1 = call <vscale x 1 x i1> @llvm.fptoui.sat.nxv1i1.nxv1f64(<vscale x 1 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv2f32_nxv2i8 = call <vscale x 2 x i8> @llvm.fptoui.sat.nxv2i8.nxv2f32(<vscale x 2 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv2f64_nxv2i8 = call <vscale x 2 x i8> @llvm.fptoui.sat.nxv2i8.nxv2f64(<vscale x 2 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv2f32_nxv2i16 = call <vscale x 2 x i16> @llvm.fptoui.sat.nxv2i16.nxv2f32(<vscale x 2 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv2f64_nxv2i16 = call <vscale x 2 x i16> @llvm.fptoui.sat.nxv2i16.nxv2f64(<vscale x 2 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv2f32_nxv2i32 = call <vscale x 2 x i32> @llvm.fptoui.sat.nxv2i32.nxv2f32(<vscale x 2 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv2f64_nxv2i32 = call <vscale x 2 x i32> @llvm.fptoui.sat.nxv2i32.nxv2f64(<vscale x 2 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv2f32_nxv2i64 = call <vscale x 2 x i64> @llvm.fptoui.sat.nxv2i64.nxv2f32(<vscale x 2 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv2f64_nxv2i64 = call <vscale x 2 x i64> @llvm.fptoui.sat.nxv2i64.nxv2f64(<vscale x 2 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv2f32_nxv2i1 = call <vscale x 2 x i1> @llvm.fptoui.sat.nxv2i1.nxv2f32(<vscale x 2 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv2f64_nxv2i1 = call <vscale x 2 x i1> @llvm.fptoui.sat.nxv2i1.nxv2f64(<vscale x 2 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv4f32_nxv4i8 = call <vscale x 4 x i8> @llvm.fptoui.sat.nxv4i8.nxv4f32(<vscale x 4 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv4f64_nxv4i8 = call <vscale x 4 x i8> @llvm.fptoui.sat.nxv4i8.nxv4f64(<vscale x 4 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv4f32_nxv4i16 = call <vscale x 4 x i16> @llvm.fptoui.sat.nxv4i16.nxv4f32(<vscale x 4 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv4f64_nxv4i16 = call <vscale x 4 x i16> @llvm.fptoui.sat.nxv4i16.nxv4f64(<vscale x 4 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv4f32_nxv4i32 = call <vscale x 4 x i32> @llvm.fptoui.sat.nxv4i32.nxv4f32(<vscale x 4 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv4f64_nxv4i32 = call <vscale x 4 x i32> @llvm.fptoui.sat.nxv4i32.nxv4f64(<vscale x 4 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv4f32_nxv4i64 = call <vscale x 4 x i64> @llvm.fptoui.sat.nxv4i64.nxv4f32(<vscale x 4 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv4f64_nxv4i64 = call <vscale x 4 x i64> @llvm.fptoui.sat.nxv4i64.nxv4f64(<vscale x 4 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv4f32_nxv4i1 = call <vscale x 4 x i1> @llvm.fptoui.sat.nxv4i1.nxv4f32(<vscale x 4 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv4f64_nxv4i1 = call <vscale x 4 x i1> @llvm.fptoui.sat.nxv4i1.nxv4f64(<vscale x 4 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv8f32_nxv8i8 = call <vscale x 8 x i8> @llvm.fptoui.sat.nxv8i8.nxv8f32(<vscale x 8 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv8f64_nxv8i8 = call <vscale x 8 x i8> @llvm.fptoui.sat.nxv8i8.nxv8f64(<vscale x 8 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv8f32_nxv8i16 = call <vscale x 8 x i16> @llvm.fptoui.sat.nxv8i16.nxv8f32(<vscale x 8 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv8f64_nxv8i16 = call <vscale x 8 x i16> @llvm.fptoui.sat.nxv8i16.nxv8f64(<vscale x 8 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv8f32_nxv8i32 = call <vscale x 8 x i32> @llvm.fptoui.sat.nxv8i32.nxv8f32(<vscale x 8 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv8f64_nxv8i32 = call <vscale x 8 x i32> @llvm.fptoui.sat.nxv8i32.nxv8f64(<vscale x 8 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv8f32_nxv8i64 = call <vscale x 8 x i64> @llvm.fptoui.sat.nxv8i64.nxv8f32(<vscale x 8 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv8f64_nxv8i64 = call <vscale x 8 x i64> @llvm.fptoui.sat.nxv8i64.nxv8f64(<vscale x 8 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv8f32_nxv8i1 = call <vscale x 8 x i1> @llvm.fptoui.sat.nxv8i1.nxv8f32(<vscale x 8 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv8f64_nxv8i1 = call <vscale x 8 x i1> @llvm.fptoui.sat.nxv8i1.nxv8f64(<vscale x 8 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv16f32_nxv16i8 = call <vscale x 16 x i8> @llvm.fptoui.sat.nxv16i8.nxv16f32(<vscale x 16 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv16f64_nxv16i8 = call <vscale x 16 x i8> @llvm.fptoui.sat.nxv16i8.nxv16f64(<vscale x 16 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv16f32_nxv16i16 = call <vscale x 16 x i16> @llvm.fptoui.sat.nxv16i16.nxv16f32(<vscale x 16 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv16f64_nxv16i16 = call <vscale x 16 x i16> @llvm.fptoui.sat.nxv16i16.nxv16f64(<vscale x 16 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv16f32_nxv16i32 = call <vscale x 16 x i32> @llvm.fptoui.sat.nxv16i32.nxv16f32(<vscale x 16 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv16f64_nxv16i32 = call <vscale x 16 x i32> @llvm.fptoui.sat.nxv16i32.nxv16f64(<vscale x 16 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv16f32_nxv16i64 = call <vscale x 16 x i64> @llvm.fptoui.sat.nxv16i64.nxv16f32(<vscale x 16 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv16f64_nxv16i64 = call <vscale x 16 x i64> @llvm.fptoui.sat.nxv16i64.nxv16f64(<vscale x 16 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %nxv16f32_nxv16i1 = call <vscale x 16 x i1> @llvm.fptoui.sat.nxv16i1.nxv16f32(<vscale x 16 x float> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Invalid cost for instruction: %nxv16f64_nxv16i1 = call <vscale x 16 x i1> @llvm.fptoui.sat.nxv16i1.nxv16f64(<vscale x 16 x double> undef)
+; RV64ZVE32F-NEXT:  Cost Model: Found an estimated cost of 0 for instruction: ret void
+;
+; RV64V-LABEL: 'fptoui_sat'
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f32_v1i8 = call <1 x i8> @llvm.fptoui.sat.v1i8.v1f32(<1 x float> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f64_v1i8 = call <1 x i8> @llvm.fptoui.sat.v1i8.v1f64(<1 x double> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f32_v1i16 = call <1 x i16> @llvm.fptoui.sat.v1i16.v1f32(<1 x float> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f64_v1i16 = call <1 x i16> @llvm.fptoui.sat.v1i16.v1f64(<1 x double> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f32_v1i32 = call <1 x i32> @llvm.fptoui.sat.v1i32.v1f32(<1 x float> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f64_v1i32 = call <1 x i32> @llvm.fptoui.sat.v1i32.v1f64(<1 x double> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f32_v1i64 = call <1 x i64> @llvm.fptoui.sat.v1i64.v1f32(<1 x float> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f64_v1i64 = call <1 x i64> @llvm.fptoui.sat.v1i64.v1f64(<1 x double> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f32_v1i1 = call <1 x i1> @llvm.fptoui.sat.v1i1.v1f32(<1 x float> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v1f64_v1i1 = call <1 x i1> @llvm.fptoui.sat.v1i1.v1f64(<1 x double> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f32_v2i8 = call <2 x i8> @llvm.fptoui.sat.v2i8.v2f32(<2 x float> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f64_v2i8 = call <2 x i8> @llvm.fptoui.sat.v2i8.v2f64(<2 x double> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f32_v2i16 = call <2 x i16> @llvm.fptoui.sat.v2i16.v2f32(<2 x float> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f64_v2i16 = call <2 x i16> @llvm.fptoui.sat.v2i16.v2f64(<2 x double> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f32_v2i32 = call <2 x i32> @llvm.fptoui.sat.v2i32.v2f32(<2 x float> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f64_v2i32 = call <2 x i32> @llvm.fptoui.sat.v2i32.v2f64(<2 x double> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f32_v2i64 = call <2 x i64> @llvm.fptoui.sat.v2i64.v2f32(<2 x float> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f64_v2i64 = call <2 x i64> @llvm.fptoui.sat.v2i64.v2f64(<2 x double> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f32_v2i1 = call <2 x i1> @llvm.fptoui.sat.v2i1.v2f32(<2 x float> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v2f64_v2i1 = call <2 x i1> @llvm.fptoui.sat.v2i1.v2f64(<2 x double> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f32_v4i8 = call <4 x i8> @llvm.fptoui.sat.v4i8.v4f32(<4 x float> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f64_v4i8 = call <4 x i8> @llvm.fptoui.sat.v4i8.v4f64(<4 x double> undef)
+; RV64V-NEXT:  Cost Model: Found an estimated cost of 2 for instruction: %v4f32_v4i16 = call <4 x i16> @llvm.fptoui.sat.v4i16.v4f32(<4 x float> undef)
+; RV64V-NEXT:  Cost Model: Found an est...
[truncated]

github-actions bot commented Jul 9, 2025

✅ With the latest revision this PR passed the undef deprecator.

Collaborator

@davemgreen davemgreen left a comment

Can you add a comment explaining why it needs to check the two types? (It might be able to get away with just checking the Tys, since RetTy will be checked later. It sounds OK to keep both checks though, to be safe.)

Otherwise LGTM.

@topperc
Collaborator

topperc commented Jul 10, 2025

This fixes the case where one or both of the types is invalid. But don't we need to take the maximum of the two LT.second values when both types are valid but one of them splits more than the other?
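
The concern raised here can be illustrated with a small sketch. This is a toy model, not LLVM code: splitCount and castSplitFactor are hypothetical helpers, and treating a type as split into ceil(total bits / register bits) pieces is a simplifying assumption about type legalization.

```cpp
// Toy model only; none of these names exist in LLVM.

// Hypothetical: number of registers a fixed-width vector occupies, assuming
// it is split into RegBits-sized pieces (rounded up).
constexpr unsigned splitCount(unsigned ElemBits, unsigned NumElems,
                              unsigned RegBits) {
  return (ElemBits * NumElems + RegBits - 1) / RegBits;
}

// Hypothetical: a cast touches both types, so scale its cost by whichever
// side is split into more pieces.
constexpr unsigned castSplitFactor(unsigned SrcSplits, unsigned RetSplits) {
  return SrcSplits > RetSplits ? SrcSplits : RetSplits;
}

// <8 x double> -> <8 x i8> with assumed 128-bit registers:
// the source needs 4 registers, the result only 1.
constexpr unsigned ExampleSrcSplits = splitCount(64, 8, 128);
constexpr unsigned ExampleRetSplits = splitCount(8, 8, 128);
static_assert(ExampleSrcSplits == 4 && ExampleRetSplits == 1, "split counts");
static_assert(castSplitFactor(ExampleSrcSplits, ExampleRetSplits) == 4,
              "the wider side determines the factor");
```

In this model, a cost derived only from the result type would see a single register and miss the four-way split on the source side.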

Co-authored-by: Craig Topper <craig.topper@sifive.com>
@ElvisWang123
Contributor Author

This fixes the case where one or both of the types is invalid. But don't we need to take the maximum of the two LT.second values when both types are valid but one of them splits more than the other?

I think it would be better to leave this to a follow-up PR.

Collaborator

@topperc topperc left a comment

LGTM

@davemgreen
Collaborator

This fixes the case where one or both of the types is invalid. But don't we need to take the maximum of the two LT.second values when both types are valid but one of them splits more than the other?

Perhaps (in a followup) we could say that custom lowering for fptosi.sat isn't meaningful enough (only based on RetTy) to be useful and should always use the fallback costs (line 2758-2783). The target can override it based on the extra information it knows.

@ElvisWang123
Contributor Author

Perhaps (in a followup) we could say that custom lowering for fptosi.sat isn't meaningful enough (only based on RetTy) to be useful and should always use the fallback costs (line 2758-2783). The target can override it based on the extra information it knows.

Sorry, I am not sure what custom lowering is. Is that the expansion cost (Line 2533 - 2551)? https://github.com/llvm/llvm-project/pull/147657/files#diff-3bd7f3712428683f95ff47b2f16b54ab32f478d036deb2f6463a5560bc97712bR2533-R2551

@davemgreen
Collaborator

I might be wrong about what is going on, but yes, I assumed that the backends marking those opcodes as Custom, as in these places:

setOperationAction(ISD::FP_TO_SINT_SAT, VT, Custom);
https://github.com/llvm/llvm-project/blob/20becf373edcf9d568f8904c2b473e6b48500787/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp#L2060
setOperationAction({ISD::FP_TO_SINT_SAT, ISD::FP_TO_UINT_SAT}, VT,

makes the cost model use the isOperationCustom cost of 2 in the code you pointed to. That might not be very useful for a lot of types though (it doesn't consider Tys[0], only RetTy), and maybe for this particular operation it isn't going to be correct enough to be useful.
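
The blind spot described here can be shown with a minimal sketch. It is a toy model, not the BasicTTIImpl code: SatCast and customLoweredCost are hypothetical names, and scaling the "cost of 2" by the number of pieces the result type is split into is an assumption made for illustration.

```cpp
// Toy model only; SatCast and customLoweredCost are hypothetical names.
struct SatCast {
  unsigned SrcSplits; // pieces the source type (Tys[0]) is split into
  unsigned RetSplits; // pieces the result type (RetTy) is split into
};

// Models a Custom-lowered cost that looks only at the result type:
// an assumed 2 per result piece. SrcSplits is deliberately never read.
constexpr unsigned customLoweredCost(SatCast C) {
  return C.RetSplits * 2;
}

// Two casts with the same result type but very different source types
// receive the same cost in this model.
static_assert(customLoweredCost(SatCast{1, 1}) ==
                  customLoweredCost(SatCast{4, 1}),
              "source type does not influence the cost");
```

A target that knows how expensive the wide source really is could override this with its own cost, which is the direction suggested in the comment above.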

@ElvisWang123
Contributor Author

Thanks for the information! 😃

I will try to use the fallback implementation or improve the cost of custom lowering in a follow-up patch.

@ElvisWang123 ElvisWang123 merged commit 2137354 into llvm:main Jul 10, 2025
9 checks passed
@ElvisWang123 ElvisWang123 deleted the fix-cast-sat-invalid-basic branch July 10, 2025 06:44
std::pair<InstructionCost, MVT> RetLT = getTypeLegalizationCost(RetTy);

// For cast instructions, types are different between source and
// destination. Also need to check if the source type can be legalize.
Contributor

I don't understand this, any type should be legalizable. Why would this ever return an invalid cost?

Contributor Author

Some vector types cannot be legalized if the target does not support them.

For example, fptoui.sat(<vscale x 1 x double> ...) uses an illegal type on riscv64 with +zve32f (the target supports only f32 vectors, not f64 vectors), and it cannot be scalarized because the element count is unknown at compile time. So the cost of fptoui.sat(<vscale x 1 x double> ...) should be invalid.
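
The reasoning above can be sketched with a toy model. None of this is LLVM's API: VecType, canLegalize, and hasValidSatCastCost are hypothetical names, and the 32-bit element limit is an assumption standing in for a Zve32f-like target.

```cpp
// Toy model only; none of these names exist in LLVM.
struct VecType {
  unsigned ElemBits; // element width in bits (64 for double or i64)
  unsigned MinElems; // element count, or the minimum count if scalable
  bool Scalable;     // true for <vscale x N x T>
};

// Assumed Zve32f-like limit: vector elements wider than 32 bits are
// not supported by the vector unit.
constexpr unsigned MaxVecElemBits = 32;

// A fixed-width vector with unsupported elements can still be scalarized,
// because its element count is known. A scalable vector cannot, because
// its element count is unknown at compile time.
constexpr bool canLegalize(VecType Ty) {
  return Ty.ElemBits <= MaxVecElemBits || !Ty.Scalable;
}

// Mirrors the shape of the check in the patch: a saturating cast has a
// valid cost only if both the source and the result type can be legalized.
constexpr bool hasValidSatCastCost(VecType Src, VecType Ret) {
  return canLegalize(Src) && canLegalize(Ret);
}

// <vscale x 1 x double> -> <vscale x 1 x i8>: invalid, the source is illegal.
static_assert(!hasValidSatCastCost(VecType{64, 1, true}, VecType{8, 1, true}),
              "scalable f64 source cannot be legalized");
```

In this model, checking only the result type would accept the cast above, since <vscale x 1 x i8> is fine on its own; the source check is what rejects it.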

Labels
llvm:analysis Includes value tracking, cost tables and constant folding
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[RISCV] Incorrect cost for fptosi_sat.v8i16.v8f64 with Zve32f
6 participants